Wrap-Up: a Trainable Discourse Module for Information Extraction

Authors

  • Stephen Soderland
  • Wendy G. Lehnert
Abstract

The vast amounts of on-line text now available have led to renewed interest in information extraction (IE) systems that analyze unrestricted text, producing a structured representation of selected information from the text. This paper presents a novel approach that uses machine learning to acquire knowledge for some of the higher level IE processing. Wrap-Up is a trainable IE discourse component that makes intersentential inferences and identifies logical relations among information extracted from the text. Previous corpus-based approaches were limited to lower level processing such as part-of-speech tagging, lexical disambiguation, and dictionary construction. Wrap-Up is fully trainable, and not only automatically decides what classifiers are needed, but even derives the feature set for each classifier automatically. Performance equals that of a partially trainable discourse module requiring manual customization for each domain.

Related resources

Learning Domain-Specific Discourse Rules for Information Extraction (AAAI 1995 Spring Symposium on Empirical Methods in Discourse Interpretation and Generation)

This paper describes a system that learns discourse rules for domain-specific analysis of unrestricted text. The goal of discourse analysis in this context is to transform locally identified references to relevant information in the text into a coherent representation of the entire text. This involves a complex series of decisions about merging coreferential objects, filtering out irrelevant inform...

Learning Domain-Specific Discourse Rules for Information Extraction

This paper describes a system that learns discourse rules for domain-specific analysis of unrestricted text. The goal of discourse analysis in this context is to transform locally identified references to relevant information in the text into a coherent representation of the entire text. This involves a complex series of decisions about merging coreferential objects, filtering out irrelevant inf...

Tools and techniques for rapid porting

Charlie Dolan, from Hughes Research Laboratories, discussed some of the difficulties in using trainable components in an information extraction system. The UMass/Hughes system used six different trainable components in their MUC5 system; portability between the EJV and EME domains was achieved partly through retraining these components. One of these components, the Trainable Template Generato...

Journal title:
  • J. Artif. Intell. Res.

Volume 2

Publication date 1994